-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add accuracy note to gen_bool #347
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively maybe we should use HighPrecision01
here and add a gen_bool_const
variant with this method? I don't really like having two methods for the same job, but each has its advantage/disadvantage.
src/lib.rs
Outdated
/// # Accuracy note | ||
/// | ||
/// `gen_bool` uses 32 bits of the RNG, so if you use it to generate close | ||
/// to or more than 2^32 results, a tiny bias may becomes noticable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
become
src/lib.rs
Outdated
/// `gen_bool` uses 32 bits of the RNG, so if you use it to generate close | ||
/// to or more than 2^32 results, a tiny bias may becomes noticable. | ||
/// A notable consequence of the method used here is that the worst case is | ||
/// `rng.gen_bool(0.0)`: it has a chance of 1 in 2^32 of being true, while |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I told you to use <=
after examining the upper bound but not the lower. I guess it's better to use <
then and ensure gen_bool(0.0)
is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we would shift to problem to gen_bool(1.0)
. But I see our implementation as reasonable. We could alternatively just always return false
for gen_bool(0.0)
. But I wouldn't bother, using gen_bool
like this is really nor very sensible (as I tried to wr3ite in the first comment, and the comment in the code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To top it off, some generators are explicitly designed not to produce 0. We could use (rng.gen() - 1) < p_int
, but I guess that's slower?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄 Good point. Nothing seems perfect any more once you know too many details...
That would have to use a wrapping subtraction, and would also only shift the problem from 0.0
to 1.0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it should use rejection sampling — but then there's no point shifting. I agree, we're thinking too much into this 😆
I am starting to think that implementing a different method as the |
Ready to merge? (seems like the latest nightly broke stdweb, and somehow we try to use stdweb on two builders now... #352) |
Well, apart from the spelling error, it's an improvement, though I'm not entirely happy with it. |
26ec612
to
ca4722b
Compare
I remember changing and pushing that. No idea what happened 😄. What is the part you are not entirely happy with? That there is a situation where it is not perfect? |
Yes. And that we have a more accurate approach which is just as fast (according to my benchmarking) when |
The way I see it is that with an accuracy of 32 bits, we can be off by 2^-33. And with rounding, this becomes visible in the 0.0 case.
I was thinking just the opposite 😄, thinking about the current uses of But what did you think about also adding a bernoulli distribution to add the more precise method? |
I was thinking of sim work I've done in the past where many probabilities are fixed, but read from config files, and many more are derived. But I don't know what the average use is! What, |
No, I didn't have an extra method on |
Add accuracy note to gen_bool
A first change to the documentation in reply to @huonw's comment on Reddit.
I don't think adding an extra check for
0.0
ingen_bool
is worth it. Ifp
is variable and the code expects to receive bothtrue
andfalse
, there is no problem, we are just at the limit of the accuracy. If the goal is to consistently generatefalse
by using0.0
as a constant, why are you usinggen_bool
?